Combinatorial Bayesian Optimization
Reviews: Combinatorial Bayesian Optimization using the Graph Cartesian Product
This manuscript proposes a system for combinatorial Bayesian optimization called COMBO, aimed at problems with large numbers of categorical and/or ordinal features. The main contribution is an effective kernel for this setting based on applying a graph kernel to the graph Cartesian product of each of the features, which can be computed efficiently by exploiting structure. This kernel can be further enhanced using an ARD extension and a horseshoe prior to encourage sparse feature selection. The COMBO system then creates a GP with this kernel and does random local search to maximize an acquisition function such as EI in the combinatorial space. A series of experiments demonstrate COMBO performing better on real and synthetic tasks than alternatives such as systems using one-hot encodings.
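The review mentions that COMBO maximizes the acquisition function by random local search in the combinatorial space. A minimal sketch of that idea, using random-restart hill climbing over single-variable changes (an illustration of the general technique, not COMBO's exact procedure; the function names and restart/step counts are chosen for the example):

```python
import numpy as np

rng = np.random.default_rng(1)

def neighbors(x, n_choices):
    """All points reachable from x by changing one variable to another category."""
    out = []
    for i, k in enumerate(n_choices):
        for v in range(k):
            if v != x[i]:
                y = list(x)
                y[i] = v
                out.append(tuple(y))
    return out

def local_search(acq, n_choices, restarts=5, steps=50):
    """Random-restart hill climbing on an acquisition function `acq`
    over a categorical space with `n_choices[i]` values per variable."""
    best_x, best_a = None, -np.inf
    for _ in range(restarts):
        x = tuple(rng.integers(0, k) for k in n_choices)
        for _ in range(steps):
            cand = max(neighbors(x, n_choices), key=acq)
            if acq(cand) <= acq(x):
                break  # local optimum: no single-variable change improves acq
            x = cand
        if acq(x) > best_a:
            best_x, best_a = x, acq(x)
    return best_x
```

Because each restart only ever moves to an improving neighbor, the search terminates at a point where no single-variable flip increases the acquisition value.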
Combinatorial Bayesian Optimization using the Graph Cartesian Product
This paper focuses on Bayesian Optimization (BO) for objectives on combinatorial search spaces, including ordinal and categorical variables. Despite the abundance of potential applications of Combinatorial BO, including chipset configuration search and neural architecture search, only a handful of methods have been proposed. We introduce COMBO, a new Gaussian Process (GP) BO. The vertex set of the combinatorial graph consists of all possible joint assignments of the variables, while edges are constructed using the graph Cartesian product of the sub-graphs that represent the individual variables. On this combinatorial graph, we propose an ARD diffusion kernel with which the GP is able to model high-order interactions between variables leading to better performance.
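The key computational trick behind this kernel is that the Laplacian of a graph Cartesian product is the Kronecker sum of the factor Laplacians, so the diffusion kernel exp(-βL) factorizes into a product of one small matrix exponential per variable. A sketch of that factorized ARD diffusion kernel (the sub-graph choices and per-variable β parameters here are illustrative assumptions, not the paper's exact configuration):

```python
import numpy as np
from scipy.linalg import expm

def complete_graph_laplacian(k):
    """Laplacian of the complete graph on k vertices (one vertex per category)."""
    A = np.ones((k, k)) - np.eye(k)
    return np.diag(A.sum(axis=1)) - A

def path_graph_laplacian(k):
    """Laplacian of a path graph on k vertices (a natural choice for ordinal variables)."""
    A = np.zeros((k, k))
    idx = np.arange(k - 1)
    A[idx, idx + 1] = A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def ard_diffusion_kernel(x, y, laplacians, betas):
    """K(x, y) = prod_i [exp(-beta_i * L_i)]_{x_i, y_i}.

    Since L(G1 x G2) = L1 (+) L2 (Kronecker sum) and the two summands
    commute, exp(-beta * L) factorizes over sub-graphs, so the cost is
    linear in the number of variables rather than exponential.
    """
    k = 1.0
    for xi, yi, L, b in zip(x, y, laplacians, betas):
        k *= expm(-b * L)[xi, yi]
    return k
```

The ARD structure comes from letting each sub-graph carry its own β, which the GP marginal likelihood can then tune per variable.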
Random Postprocessing for Combinatorial Bayesian Optimization
Morita, Keisuke, Nishikawa, Yoshihiko, Ohzeki, Masayuki
Sigma-i Co., Ltd., Tokyo, 108-0075, Japan
Model-based sequential approaches to discrete "black-box" optimization, including Bayesian optimization techniques, often access the same points multiple times for a given objective function of interest, resulting in many steps to find the global optimum. Here, we numerically study the effect of a postprocessing method on Bayesian optimization that strictly prohibits duplicated samples in the dataset. We find the postprocessing method significantly reduces the number of sequential steps to find the global optimum, especially when the acquisition function is of maximum a posteriori estimation. Our results provide a simple but general strategy to solve the slow convergence of Bayesian optimization for high-dimensional problems.
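The postprocessing rule studied here amounts to restricting the acquisition maximizer to points not yet in the dataset. A minimal sketch of that restriction over a finite candidate set (the surrogate and acquisition are stand-ins for the example, not the authors' implementation):

```python
import numpy as np

def suggest_next(candidates, acquisition, visited):
    """Pick the acquisition maximizer among candidates not yet evaluated.

    `visited` holds the indices already in the dataset; masking them out
    guarantees no duplicated samples, which is the postprocessing rule
    the paper studies.
    """
    scores = np.asarray([acquisition(c) for c in candidates], dtype=float)
    scores[list(visited)] = -np.inf  # forbid re-sampling known points
    return int(np.argmax(scores))

# Toy run on {0,...,9}: a greedy acquisition that always prefers the
# incumbent region would re-propose the same point forever without the mask.
f = lambda i: -(i - 7) ** 2        # hypothetical black box, maximum at i = 7
acq = lambda i: f(i)               # stand-in acquisition (a "perfect" posterior mean)
visited, best = set(), None
for _ in range(4):
    i = suggest_next(range(10), acq, visited)
    visited.add(i)
    best = f(i) if best is None else max(best, f(i))
```

Without the mask, the loop above would evaluate i = 7 four times; with it, each step is forced to reveal a new point.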
Combinatorial Bayesian Optimization with Random Mapping Functions to Convex Polytope
Kim, Jungtaek, Cho, Minsu, Choi, Seungjin
Bayesian optimization is a popular method for solving the problem of global optimization of an expensive-to-evaluate black-box function. It relies on a probabilistic surrogate model of the objective function, upon which an acquisition function is built to determine where next to evaluate the objective function. In general, Bayesian optimization with Gaussian process regression operates on a continuous space. When input variables are categorical or discrete, extra care is needed. A common approach is to use a one-hot encoded or Boolean representation for categorical variables, which might yield a combinatorial explosion problem. In this paper we present a method for Bayesian optimization in a combinatorial space, which can operate well in a large combinatorial space. The main idea is to use a random mapping which embeds the combinatorial space into a convex polytope in a continuous space, on which all essential steps are performed to determine a solution to the black-box optimization in the combinatorial space. We describe our combinatorial Bayesian optimization algorithm and present its regret analysis. Numerical experiments demonstrate that our method outperforms existing methods.
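The core idea is to run the continuous-space machinery (surrogate, acquisition) inside a convex polytope and decode each continuous query back to a combinatorial point through a random mapping. One plausible construction in that spirit, for a binary search space (the Gaussian matrix, the sign decoder, and the random-search inner loop are all illustrative assumptions, not the paper's exact algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

n_binary, d_cont = 20, 5                       # combinatorial dim, embedding dim
A = rng.standard_normal((n_binary, d_cont))    # random mapping (assumed Gaussian)

def to_combinatorial(z):
    """Decode a continuous point z in the polytope [-1, 1]^d to a binary vector.

    sign(A z) is one simple random-mapping decoder; it only illustrates
    running the optimization loop in a continuous convex set while the
    black box is evaluated in the combinatorial space.
    """
    return (A @ z > 0).astype(int)

def objective(x):
    """Toy black box: number of ones in the binary vector."""
    return int(x.sum())

# Continuous-space random search standing in for the BO inner loop:
best_val, best_x = -np.inf, None
for _ in range(50):
    z = rng.uniform(-1.0, 1.0, size=d_cont)    # point in the convex polytope
    x = to_combinatorial(z)
    v = objective(x)
    if v > best_val:
        best_val, best_x = v, x
```

The continuous inner loop never sees the combinatorial space directly; only the decoded points are passed to the expensive black box.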
Combinatorial Bayesian Optimization using the Graph Cartesian Product
Oh, Changyong, Tomczak, Jakub, Gavves, Efstratios, Welling, Max
Combinatorial Bayesian Optimization using Graph Representations
Oh, Changyong, Tomczak, Jakub M., Gavves, Efstratios, Welling, Max
This paper focuses on Bayesian Optimization - typically considered with continuous inputs - for discrete search input spaces, including integer, categorical or graph structured input variables. In Gaussian process-based Bayesian Optimization a problem arises, as it is not straightforward to define a proper kernel on discrete input structures, where no natural notion of smoothness or similarity could be provided. We propose COMBO, a method that represents values of discrete variables as vertices of a graph and then uses the diffusion kernel on that graph. As the graph size explodes with the number of categorical variables and categories, we propose the graph Cartesian product to decompose the graph into smaller sub-graphs, enabling kernel computation in linear time with respect to the number of input variables. Moreover, in our formulation we learn a scale parameter per sub-graph. In empirical studies on four discrete optimization problems we demonstrate that our method is on par with or outperforms the state-of-the-art in discrete Bayesian optimization.